Mining Maximal Flexible Patterns in a Sequence
Abstract
We consider the problem of enumerating all maximal flexible patterns in an input sequence database, where a maximal pattern (also called a closed pattern) is the most specific pattern in the equivalence class of flexible patterns that have the same list of occurrences in the input. Since our notion of maximal patterns is based on position occurrences, it is weaker than the traditional notion of maximal patterns based on document occurrences. Based on the framework of reverse search, we present an efficient depth-first search algorithm, MaxFlex, that enumerates all maximal flexible patterns in a given sequence database without duplicates in O(||T|| × |Σ|) time per pattern and O(||T||) space, where ||T|| is the size of the input sequence database T and |Σ| is the size of the alphabet over which the sequences are defined. Hence the enumeration problem for maximal flexible patterns is solvable with polynomial delay and polynomial space.
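The equivalence classes mentioned in the abstract can be illustrated with a small brute-force sketch. This is not the MaxFlex algorithm. It only computes position occurrence lists naively, so that patterns sharing the same list can be compared. It assumes that a flexible pattern is written with `*` for a wildcard matching any possibly empty string, and that an occurrence is a start position from which the whole pattern can be matched; the abstract does not spell out either convention, and the function names are hypothetical.

```python
def occurrences(pattern, text):
    """Start positions in `text` at which `pattern` can be matched.

    Assumption: `*` is a wildcard for any (possibly empty) string, and
    the pattern begins with a constant segment.
    """
    segments = pattern.split("*")
    found = []
    for start in range(len(text)):
        # The first constant segment must begin exactly at `start`.
        if not text.startswith(segments[0], start):
            continue
        pos = start + len(segments[0])
        matched = True
        # Each later segment is placed at its leftmost feasible position;
        # since wildcards absorb any gap, this greedy choice is sufficient.
        for seg in segments[1:]:
            idx = text.find(seg, pos)
            if idx < 0:
                matched = False
                break
            pos = idx + len(seg)
        if matched:
            found.append(start)
    return found


def occurrences_in_db(pattern, database):
    """(sequence index, start position) pairs over a list of sequences."""
    return [(i, p) for i, seq in enumerate(database)
            for p in occurrences(pattern, seq)]


if __name__ == "__main__":
    db = ["ABCABC"]
    for pat in ["A*C", "AB*C", "ABC", "B*C"]:
        print(pat, occurrences_in_db(pat, db))
```

On the single sequence `ABCABC`, the patterns `A*C`, `AB*C`, and `ABC` all have the same occurrence list under these assumptions, so they fall into one equivalence class, and `ABC` is the most specific of the three. Adding a second sequence such as `XAC` separates `A*C` from the others, which shows that the classes depend on the input database.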
Similar resources
Efficient Algorithms for Mining Maximal Flexible Patterns in Texts and Sequences
In this paper, we study the maximal pattern discovery problem in a given sequence for the class ERP of flexible patterns with applications to text mining, where a flexible pattern is a sequence of constants and wildcards for possibly empty strings, such as AB*B*ABC, also known as an erasing regular pattern. We first discuss the framework of optimal pattern discovery for predictive mining and te...
Query Driven Sequence Pattern Mining
The discovery of frequent patterns present in biological sequences has a large number of applications, ranging from classification, clustering and understanding sequence structure and function. This paper presents an algorithm that discovers frequent sequence patterns (motifs) present in a query sequence in respect to a database of sequences. The query is used to guide the mining process and th...
High Fuzzy Utility Based Frequent Patterns Mining Approach for Mobile Web Services Sequences
High fuzzy utility based pattern mining is an emerging topic in data mining. It refers to discovering all patterns whose utility meets a user-specified minimum high utility threshold. It comprises extracting patterns which are highly accessed in mobile web service sequences. Different from the traditional fuzzy approach, high fuzzy utility mining considers not only counts of mob...
An Efficient Approach to Mining Maximal Contiguous Frequent Patterns from Large DNA Sequence Databases
Mining interesting patterns from DNA sequences is one of the most challenging tasks in bioinformatics and computational biology. Maximal contiguous frequent patterns are preferable for expressing the function and structure of DNA sequences and hence can capture the common data characteristics among related sequences. Biologists are interested in finding frequent orderly arrangements of motifs t...
A Top-Down Algorithm for Mining Maximal Traversal Paths in Web Log Sessions
Mining of frequent traversal paths in web logs is an application of sequence mining and useful with many applications that include web recommendation, caching, pre-fetching etc. Most of the existing algorithms follow a bottom-up approach to mine sequence patterns in a database. In this paper, a fast top-down algorithm is presented to discover maximal traversal paths which are contiguous sequenc...
Publication date: 2007